feat(linux): gate custom cloud cert pull on RCV 1P opt-in#7958
Pull request overview
This PR introduces RCV 1P (Root Certificate Validation 1st Party) opt-in gating for custom cloud certificate retrieval in AKS. The change modifies the init-aks-custom-cloud.sh script to check whether a VM is opted into RCV 1P via a new wireserver endpoint (/acms/isOptedInForRootCerts). When opted in, the script uses new operation-request endpoints to pull root and intermediate certificates; otherwise, it falls back to the legacy cacertificates flow.
Changes:
- Added RCV 1P opt-in detection via wireserver endpoint with JSON boolean matching
- Implemented new certificate retrieval path using operationrequestsroot and operationrequestsintermediate endpoints
- Preserved backward compatibility with legacy certificate retrieval for non-opted-in VMs
- Normalized Flatcar certificate installation by converting .crt files to .pem during copy operation
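The gating decision described above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: `is_opted_in` and `select_cert_flow` are hypothetical names, and only the `IsOptedInForRootCerts` pattern is taken from the PR diff.

```shell
#!/usr/bin/env bash
# Sketch only: function names are hypothetical, not taken from the PR.

# Returns 0 when the opt-in response body contains "IsOptedInForRootCerts": true.
is_opted_in() {
    local response="$1"
    echo "$response" | grep -Eq '"IsOptedInForRootCerts"[[:space:]]*:[[:space:]]*true'
}

# Prints which retrieval flow would be taken for a given response body.
select_cert_flow() {
    local response="$1"
    if is_opted_in "$response"; then
        echo "rcv1p"    # operationrequestsroot / operationrequestsintermediate
    else
        echo "legacy"   # cacertificates fallback
    fi
}
```

An empty response (for example, after a failed request) falls through to the legacy flow in this sketch, which matches the backward-compatibility goal stated in the list.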
| # Save the certificate to the appropriate location | ||
| echo "$cert_content" > "/root/AzureCACertificates/$cert_filename" | ||
| echo "Successfully saved certificate: $cert_filename" |
When the RCV 1P opt-in path is taken, certificates are saved with their original filenames (including .crt extension) in line 81. However, the Flatcar cert installation logic at lines 143-147 attempts to copy files matching "*.crt" pattern. This creates an inconsistency: the fallback path (lines 121-124) saves files with .pem extension for Flatcar, but the RCV 1P path saves them as .crt regardless of distro. Either the cert saving logic should respect IS_FLATCAR and save as .pem, or the RCV 1P cert download should ensure filenames use appropriate extensions for each distro.
| # Save the certificate to the appropriate location | |
| echo "$cert_content" > "/root/AzureCACertificates/$cert_filename" | |
| echo "Successfully saved certificate: $cert_filename" | |
| # Determine target filename based on distro to match cert installation expectations | |
| local target_cert_filename="$cert_filename" | |
| if [ "$IS_FLATCAR" -eq 1 ]; then | |
| # For Flatcar, use .pem extension to align with fallback cert handling | |
| target_cert_filename="${filename}.pem" | |
| fi | |
| # Save the certificate to the appropriate location | |
| echo "$cert_content" > "/root/AzureCACertificates/$target_cert_filename" | |
| echo "Successfully saved certificate: $target_cert_filename" |
| certNames=($(echo $certs | grep -oP '(?<=Name\": \")[^\"]*')) | ||
| certBodies=($(echo $certs | grep -oP '(?<=CertBody\": \")[^\"]*')) |
Variable expansions should be quoted to prevent word splitting and glob expansion. The expressions ${certBodies[@]} and ${certNames[@]} within command substitutions should have proper quoting. The use of unquoted variables in command substitution, especially with grep and sed, can lead to unexpected behavior if the data contains special characters or spaces.
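A quoted version of the write loop might look like the sketch below. It assumes `certNames` and `certBodies` are parallel arrays as in the diff; `write_legacy_certs` is a hypothetical name, and the `\r\n` and backslash unescaping that the real script does with `sed` is left out to keep the focus on quoting.

```shell
#!/usr/bin/env bash
# Hypothetical helper: writes each body to <out_dir>/<name with .cer replaced by .$ext>.
# certNames and certBodies are parallel arrays set by the caller.
write_legacy_certs() {
    local out_dir="$1" ext="$2"
    local i name
    # Quoting "${!certBodies[@]}" and each element prevents word splitting and globbing.
    for i in "${!certBodies[@]}"; do
        name="${certNames[$i]}"
        # ${name%.cer} strips a literal trailing ".cer" without a regex or a subshell.
        printf '%s\n' "${certBodies[$i]}" > "${out_dir}/${name%.cer}.${ext}"
    done
}
```

Using parameter expansion for the extension swap also avoids the unescaped `.` in `sed "s/.cer/..."`, which matches any character followed by `cer`.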
| else | ||
| echo "Not opted in for root certs, skipping CA cert pull and install" | ||
| # http://168.63.129.16 is a constant for the host's wireserver endpoint | ||
| certs=$(curl "http://168.63.129.16/machine?comp=acmspackage&type=cacertificates&ext=json") |
The fallback curl request to retrieve certificates lacks error handling. Unlike the RCV 1P path which uses make_request_with_retry with retry logic, this curl command has no --fail flag, no retry mechanism, and no validation of the response. If the wireserver is temporarily unavailable or rate-limits the request, the script will continue with potentially empty or malformed data, leading to provisioning failures. Consider adding retry logic and error checking similar to the RCV 1P path.
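One possible shape for that retry logic is sketched below. `fetch_with_retry` is a hypothetical helper, not the PR's `make_request_with_retry` (whose signature is not shown here); it wraps any command, so the attempt count and delay are the caller's choice.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not the PR's make_request_with_retry.
# Usage: fetch_with_retry <max_attempts> <delay_seconds> <command> [args...]
# Prints the command's stdout on success; returns 1 if every attempt fails
# or produces an empty body.
fetch_with_retry() {
    local max_attempts="$1" delay="$2"
    shift 2
    local attempt body
    for (( attempt = 1; attempt <= max_attempts; attempt++ )); do
        if body="$("$@")" && [ -n "$body" ]; then
            printf '%s\n' "$body"
            return 0
        fi
        echo "attempt ${attempt}/${max_attempts} failed" >&2
        if (( attempt < max_attempts )); then
            sleep "$delay"
        fi
    done
    return 1
}
```

The legacy call could then be written as `certs="$(fetch_with_retry 5 3 curl -sS --fail "http://168.63.129.16/machine?comp=acmspackage&type=cacertificates&ext=json")" || exit 1`, where the attempt count, delay, and exit behavior are illustrative values only.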
| if [ "$IS_FLATCAR" -eq 1 ]; then | ||
| ext=".pem" | ||
| fi | ||
| for i in ${!certBodies[@]}; do | ||
| echo ${certBodies[$i]} | sed 's/\\r\\n/\n/g' | sed 's/\\//g' > "/root/AzureCACertificates/$(echo ${certNames[$i]} | sed "s/.cer/.${ext}/g")" | ||
| done | ||
| echo "successfully pulled in default certs" | ||
| fi | ||
| for i in ${!certBodies[@]}; do | ||
| echo ${certBodies[$i]} | sed 's/\\r\\n/\n/g' | sed 's/\\//g' > "/root/AzureCACertificates/$(echo ${certNames[$i]} | sed "s/.cer/.${ext}/g")" | ||
| done | ||
|
|
||
| IFS=$IFS_backup | ||
|
|
||
| if [ "$IS_FLATCAR" -eq 0 ]; then | ||
| if [ "${IS_FLATCAR}" -eq 0 ]; then |
The quote style change from "$IS_FLATCAR" to "${IS_FLATCAR}" is good for consistency, but the original comparison at line 122 still uses the old style "$IS_FLATCAR". For consistency, all IS_FLATCAR comparisons in the script should use the same quote style.
| if [ "${IS_FLATCAR}" -eq 0 ]; then | ||
| # Copy all certificate files to the system certificate directory | ||
| cp /root/AzureCACertificates/*.crt /usr/local/share/ca-certificates/ | ||
|
|
||
| # Update the system certificate store | ||
| update-ca-certificates | ||
|
|
||
| # This copies the updated bundle to the location used by OpenSSL which is commonly used | ||
| cp /etc/ssl/certs/ca-certificates.crt /usr/lib/ssl/cert.pem | ||
| else | ||
| cp /root/AzureCACertificates/*.pem /etc/ssl/certs/ | ||
| for cert in /root/AzureCACertificates/*.crt; do | ||
| destcert="${cert##*/}" | ||
| destcert="${destcert%.*}.pem" | ||
| cp "$cert" /etc/ssl/certs/"$destcert" | ||
| done |
Flatcar branch now loops over /root/AzureCACertificates/*.crt, but in the non-opt-in (legacy) path this script writes certs with a .pem extension (see ext=".pem"). When not opted in, this glob won't match and the copy will fail, leaving Flatcar without installed custom cloud certs. Either keep copying .pem for the legacy flow or normalize the downloaded artifacts to .crt consistently before the copy step (and enable nullglob to avoid iterating on a literal pattern).
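A copy step that tolerates either extension could look like the following sketch. `install_certs_as_pem` is a hypothetical name, and the set of source extensions (`.crt`, `.cer`, `.pem`) is an assumption based on the review discussion rather than on confirmed WireServer behavior.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy every downloaded cert into dest_dir with a .pem
# extension, whichever of .crt/.cer/.pem the download step produced.
# Prints the number of files copied.
install_certs_as_pem() {
    local src_dir="$1" dest_dir="$2"
    local cert base copied=0
    local restore_nullglob
    # shopt -p exits non-zero when the option is off, hence the || true.
    restore_nullglob="$(shopt -p nullglob || true)"
    shopt -s nullglob   # unmatched globs expand to nothing instead of a literal pattern
    for cert in "$src_dir"/*.crt "$src_dir"/*.cer "$src_dir"/*.pem; do
        base="${cert##*/}"
        cp "$cert" "${dest_dir}/${base%.*}.pem"
        copied=$((copied + 1))
    done
    eval "$restore_nullglob"   # put the caller's nullglob setting back
    echo "$copied"
}
```

Returning the count lets the caller log or fail when nothing was installed. If two sources share a base name (say `root.crt` and `root.pem`), the later copy overwrites the earlier one in this sketch.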
| # The purpose of RCV 1P is to reliably distribute root and intermediate certificates at scale to | ||
| # only Microsoft 1st party (1P) virtual machines (VM) and virtual machine scale sets (VMSS). | ||
| # This is critical for initiatives such as Microsoft PKI. RCV 1P ensures that these certificates | ||
| # are installed on the node at creation time. This eliminates the need for your VM to be connected | ||
| # to the internet and ping an endpoint to receive certificate packages. The feature also eliminates | ||
| # the dependency on updates to AzSecPack to receive the latest root and intermediate certs. | ||
| # RCV 1P is designed to work completely autonomously from the user perspective on all Azure 1st | ||
| # party VMs. | ||
|
|
||
| $global:WireServerEndpoint = "http://168.63.129.16" | ||
| $global:RCV1PCertificatesDirectory = "C:\AzureData\RCV1PCertificates" | ||
| $global:RCV1PCertificateRefreshTaskName = "aks-rcv1p-cert-refresh" |
PR title/description are scoped to Linux, but this change introduces substantial new Windows behavior as well (RCV 1P cert refresh + scheduled task + new parameter set). Consider updating the PR title/description to reflect the Windows impact, or split Windows changes into a separate PR for clearer review/rollout.
9677792 to 31145fe
Pull request overview
Copilot reviewed 24 out of 25 changed files in this pull request and generated 3 comments.
| # Save the certificate to the appropriate location | ||
| echo "$cert_content" > "/root/AzureCACertificates/$cert_filename" | ||
| echo "Successfully saved certificate: $cert_filename" |
In the opted-in (operation-requests) path, cert filenames are saved exactly as returned (e.g., .cer/.crt/.pem), but the install step later only copies /root/AzureCACertificates/*.crt. If WireServer returns .cer (as the legacy flow suggests), no files will match and no certs will be installed. Normalize the saved filenames/extensions (e.g., always write .crt on Ubuntu/Mariner and handle Flatcar separately) or broaden the install glob to include the actual extensions returned.
| # Save the certificate to the appropriate location | |
| echo "$cert_content" > "/root/AzureCACertificates/$cert_filename" | |
| echo "Successfully saved certificate: $cert_filename" | |
| # Determine the filename to use when saving the certificate. | |
| # On non-Flatcar systems, normalize to a .crt extension so that later | |
| # install steps that use /root/AzureCACertificates/*.crt will pick it up. | |
| local save_filename | |
| if [ "$IS_FLATCAR" -eq 1 ]; then | |
| save_filename="$cert_filename" | |
| else | |
| save_filename="${filename}.crt" | |
| fi | |
| # Save the certificate to the appropriate location | |
| echo "$cert_content" > "/root/AzureCACertificates/$save_filename" | |
| echo "Successfully saved certificate: $save_filename (original: $cert_filename)" |
|
|
||
| # Extract ResourceFileName values from the JSON response | ||
| local cert_filenames | ||
| mapfile -t cert_filenames < <(echo "$operation_response" | grep -oP '(?<="ResouceFileName": ")[^"]*') |
process_cert_operations only extracts filenames from the JSON key ResouceFileName (typo). The Windows implementation supports both ResouceFileName and ResourceFileName; if the WireServer payload uses the correct spelling, this will silently find 0 certs. Update the extraction to handle both keys (or parse JSON reliably) so cert pulls work across payload variations.
| mapfile -t cert_filenames < <(echo "$operation_response" | grep -oP '(?<="ResouceFileName": ")[^"]*') | |
| mapfile -t cert_filenames < <(echo "$operation_response" | grep -oP '"Resour?ceFileName":\s*"\K[^"]*') |
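The two spellings differ only by one optional `r`, so a single pattern can cover both. The sketch below avoids lookbehind and `grep -P` altogether by matching the whole key/value pair and then stripping the key. `extract_cert_filenames` is a hypothetical name, and it assumes the values are plain JSON strings without escaped quotes.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one filename per line for either key spelling
# ("ResouceFileName" or "ResourceFileName").
extract_cert_filenames() {
    local response="$1"
    # Match the whole "key": "value" pair, then strip everything but the value.
    # The || true keeps "no matches" from surfacing as a failure.
    { grep -oE '"Resour?ceFileName"[[:space:]]*:[[:space:]]*"[^"]*"' <<< "$response" || true; } \
        | sed -E 's/^"[^"]*"[[:space:]]*:[[:space:]]*"//; s/"$//'
}
```

It can feed `mapfile` the same way as the existing line, for example `mapfile -t cert_filenames < <(extract_cert_filenames "$operation_response")`. A real JSON parser would be more robust if one is available on the node image.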
| for cert in /root/AzureCACertificates/*.crt; do | ||
| destcert="${cert##*/}" | ||
| destcert="${destcert%.*}.pem" | ||
| cp "$cert" /etc/ssl/certs/"$destcert" | ||
| done |
Flatcar install loop iterates over /root/AzureCACertificates/*.crt, but the non-opted-in flow writes .pem files (and the opted-in flow may write .cer). This loop will no-op (and then run update-ca-certificates) if no .crt files exist. Align the Flatcar copy/convert logic with the actual extensions produced by both flows (e.g., convert whatever was downloaded into .pem during the copy step, or standardize downloads to .crt).
- add opt-in check via acms/isOptedInForRootCerts before certificate retrieval
- use operation-request endpoints for RCV 1P root/intermediate certificate pull
- keep fallback to legacy cacertificates flow when not opted in
- harden opt-in detection with curl --fail and JSON boolean match
- normalize Flatcar cert installation by converting .crt artifacts to .pem during copy
- add a dedicated CARefresh parameter set with -CARefreshOnly mode for certificate-only execution
- implement RCV 1P cert retrieval from WireServer, including opt-in detection via isOptedInForRootCerts
- add operation-request certificate download path (operationrequestsroot/operationrequestsintermediate)
- keep backward-compatible fallback to legacy cacertificates endpoint when VM is not opted in
- add retry/backoff wrapper for WireServer calls and structured RCV1P logging helper
- install downloaded certs into LocalMachine certificate stores (Root for self-signed, CA for intermediates)
- register a daily SYSTEM scheduled task (aks-rcv1p-cert-refresh) to rerun cert refresh via this script
- wire AKS custom cloud base-prep flow to run RCV 1P refresh and task registration
- document RCV 1P intent with inline comment block for maintainability
aa14963 to 40fa5ed
| echo ${certBodies[$i]} | sed 's/\\r\\n/\n/g' | sed 's/\\//g' > "/root/AzureCACertificates/$(echo ${certNames[$i]} | sed "s/.cer/.${ext}/g")" | ||
| done | ||
|
|
||
| if echo "$optInCheck" | grep -Eq '"IsOptedInForRootCerts"[[:space:]]*:[[:space:]]*true'; then |
nit: can use a here-string instead: grep -Eq '"IsOptedInForRootCerts"[[:space:]]*:[[:space:]]*true' <<< "$optInCheck"
| } | ||
|
|
||
| # Function to process certificate operations from a given endpoint | ||
| process_cert_operations() { |
nit: wondering if we can actually rename this to something more meaningful, process_cert_operations was originally chosen since we were trying to get things done quickly
| # https://eng.ms/docs/products/onecert-certificates-key-vault-and-dsms/onecert-customer-guide/autorotationandecr/rcv1ptsg | ||
|
|
||
| optInCurlStatus=0 | ||
| optInCheck=$(curl -sS --fail "http://168.63.129.16/acms/isOptedInForRootCerts" 2>/dev/null) || optInCurlStatus=$? |
nit: should use WIRESERVER_ENDPOINT here
| echo "successfully pulled in root certs" | ||
| else | ||
| echo "Not opted in for root certs, skipping CA cert pull and install" | ||
| # http://168.63.129.16 is a constant for the host's wireserver endpoint |
can probably remove this comment
| else | ||
| echo "Not opted in for root certs, skipping CA cert pull and install" | ||
| # http://168.63.129.16 is a constant for the host's wireserver endpoint | ||
| certs=$(curl "http://168.63.129.16/machine?comp=acmspackage&type=cacertificates&ext=json") |
just to clarify - in the opt-out (or not opt-in) case, we download certificates from a different wireserver endpoint, so in either case we're grabbing certs from wireserver? I guess up until now this logic has only ever been executed in custom clouds, which makes sense
though now that this would be running in all clouds, should this always be executed?
- add opt-in check via acms/isOptedInForRootCerts before certificate retrieval
- use operation-request endpoints for RCV 1P root/intermediate certificate pull
- keep fallback to legacy cacertificates flow when not opted in
- harden opt-in detection with curl --fail and JSON boolean match
- normalize Flatcar cert installation by converting .crt artifacts to .pem during copy
What this PR does / why we need it:
Which issue(s) this PR fixes:
Fixes #